In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded:[1] that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.
There are two important subclasses of heavy-tailed distributions, the long-tailed distributions and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class.
There is still some discrepancy over the use of the term heavy-tailed. There are two other definitions in use. Some authors use the term to refer to those distributions which do not have all their power moments finite; and some others to those distributions that do not have a finite variance. The definition given in this article is the most general in use, and includes all distributions encompassed by the alternative definitions, as well as those distributions such as log-normal that possess all their power moments, yet which are generally acknowledged to be heavy-tailed. (Occasionally, heavy-tailed is used for any distribution that has heavier tails than the normal distribution.)
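The distribution of a random variable $X$ with distribution function $F$ is said to have a heavy right tail if[1]
$$\limsup_{x \to \infty} e^{\lambda x} \Pr[X > x] = \infty \quad \text{for all } \lambda > 0.$$
This is also written in terms of the tail distribution function
$$\overline{F}(x) \equiv \Pr[X > x]$$
as
$$\limsup_{x \to \infty} e^{\lambda x}\, \overline{F}(x) = \infty \quad \text{for all } \lambda > 0.$$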
This is equivalent to the statement that the moment generating function of $F$, $M_F(t)$, is infinite for all $t > 0$.[2]
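As a rough numerical illustration of this definition, the sketch below evaluates $e^{\lambda x}\Pr[X > x]$ at increasing levels $x$ for a Pareto tail and for an exponential tail. The choice of distributions, the parameter values (`alpha=2.0`, `rate=1.0`, `lam=0.5`) and all function names are assumptions made for this example only.

```python
import math

def pareto_tail(x, alpha=2.0):
    """Tail P(X > x) of a Pareto distribution with scale 1 (assumed example)."""
    return x ** (-alpha) if x >= 1.0 else 1.0

def exponential_tail(x, rate=1.0):
    """Tail P(X > x) of an exponential distribution (assumed example)."""
    return math.exp(-rate * x)

def scaled_tail(tail, lam, x):
    """Evaluate exp(lam * x) * P(X > x), the quantity in the definition."""
    return math.exp(lam * x) * tail(x)

lam = 0.5                              # one illustrative value of lambda
levels = [10.0, 50.0, 100.0, 200.0]    # increasing levels x

pareto_scaled = [scaled_tail(pareto_tail, lam, x) for x in levels]
expo_scaled = [scaled_tail(exponential_tail, lam, x) for x in levels]

for x, p, e in zip(levels, pareto_scaled, expo_scaled):
    print(f"x={x:6.0f}  Pareto: {p:.3e}  exponential: {e:.3e}")
```

For the Pareto tail the scaled values keep growing, and the same happens for any other positive `lam`. For the exponential tail they shrink whenever `lam` is below the rate, so the exponential distribution fails the requirement that the condition hold for all $\lambda > 0$.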
The definitions of heavy-tailed for left-tailed or two tailed distributions are similar.
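The distribution of a random variable $X$ with distribution function $F$ is said to have a long right tail[1] if for all $t > 0$,
$$\lim_{x \to \infty} \Pr[X > x + t \mid X > x] = 1,$$
or equivalently
$$\overline{F}(x + t) \sim \overline{F}(x) \quad \text{as } x \to \infty.$$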
Intuitively, if a quantity with a long right tail exceeds some high level, the probability approaches 1 that it will also exceed any other, higher level: if you know the situation is bad, it is probably worse than you think.
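The long-tail condition can be checked numerically by computing the conditional probability of exceeding $x + t$ given that $x$ has already been exceeded, which is the ratio of the two tail probabilities. The following is a minimal sketch, assuming a Pareto tail with scale 1 and an exponential tail for comparison; the distributions, parameters and names are illustrative choices, not part of the definition.

```python
import math

def pareto_tail(x, alpha=2.0):
    """Tail P(X > x) of a Pareto distribution with scale 1 (assumed example)."""
    return x ** (-alpha) if x >= 1.0 else 1.0

def exponential_tail(x, rate=1.0):
    """Tail P(X > x) of an exponential distribution (assumed example)."""
    return math.exp(-rate * x)

def conditional_exceedance(tail, x, t):
    """P(X > x + t | X > x), computed as tail(x + t) / tail(x)."""
    return tail(x + t) / tail(x)

t = 5.0                          # fixed extra margin
levels = [10.0, 100.0, 500.0]    # increasing levels x

pareto_ratios = [conditional_exceedance(pareto_tail, x, t) for x in levels]
expo_ratios = [conditional_exceedance(exponential_tail, x, t) for x in levels]

for x, p, e in zip(levels, pareto_ratios, expo_ratios):
    print(f"x={x:5.0f}  Pareto: {p:.4f}  exponential: {e:.4f}")
```

The Pareto ratios climb toward 1 as the level grows, which is the long-tail behaviour. The exponential ratios stay at $e^{-t}$ regardless of the level, reflecting the memoryless property of that distribution.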
All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
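Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables $X_1, X_2$ with common distribution function $F$, the convolution of $F$ with itself, $F^{*2}$, is defined, using Lebesgue-Stieltjes integration, by:
$$\Pr[X_1 + X_2 \leq x] = F^{*2}(x) = \int_{-\infty}^{\infty} F(x - y)\, dF(y).$$
The n-fold convolution $F^{*n}$ is defined in the same way. The tail distribution function $\overline{F}$ is defined as $\overline{F}(x) = 1 - F(x)$.
A distribution $F$ on the positive half-line is subexponential[1] if
$$\overline{F^{*2}}(x) \sim 2\overline{F}(x) \quad \text{as } x \to \infty.$$
This implies[3] that, for any $n \geq 1$,
$$\overline{F^{*n}}(x) \sim n\overline{F}(x) \quad \text{as } x \to \infty.$$
The probabilistic interpretation[3] of this is that, for a sum of $n$ independent random variables $X_1, \ldots, X_n$ with common distribution $F$,
$$\Pr[X_1 + \cdots + X_n > x] \sim \Pr[\max(X_1, \ldots, X_n) > x] \quad \text{as } x \to \infty.$$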
This is often known as the principle of the single big jump[4].
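The single big jump principle can be explored with a small Monte Carlo experiment that compares how often the sum of several heavy-tailed draws exceeds a high level with how often the largest single draw does. The sketch below is only an illustration: the Pareto distribution with scale 1, the values `n=5`, `x=200.0`, `alpha=1.5`, the number of trials, the seed and the function names are all assumptions chosen for the example.

```python
import random

def pareto_sample(rng, alpha):
    """Draw from a Pareto distribution with scale 1 by inverse transform."""
    u = 1.0 - rng.random()    # uniform on (0, 1], so the power below is finite
    return u ** (-1.0 / alpha)

def single_big_jump_counts(n, x, alpha, trials, seed=0):
    """Count trials in which the sum exceeds x and in which the maximum exceeds x."""
    rng = random.Random(seed)
    sum_hits = 0
    max_hits = 0
    for _ in range(trials):
        draws = [pareto_sample(rng, alpha) for _ in range(n)]
        if sum(draws) > x:
            sum_hits += 1
        if max(draws) > x:
            max_hits += 1
    return sum_hits, max_hits

sum_hits, max_hits = single_big_jump_counts(n=5, x=200.0, alpha=1.5, trials=200_000)
ratio = sum_hits / max_hits    # estimate of P(sum > x) / P(max > x)
print(sum_hits, max_hits, round(ratio, 3))
```

Because the draws are positive, every trial in which the maximum exceeds the level is also one in which the sum does, so the estimated ratio is at least 1. A value close to 1 indicates that large sums are mostly produced by one very large term rather than by many moderately large ones.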
A distribution $F$ on the whole real line is subexponential if the distribution $F\, I([0, \infty))$ is.[5] Here $I([0, \infty))$ is the indicator function of the positive half-line. Alternatively, a random variable $X$ supported on the real line is subexponential if and only if $X^+ = \max(0, X)$ is subexponential.
All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
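The defining property of subexponentiality, that the tail of the two-fold convolution is asymptotically twice the tail of the distribution itself, can be checked numerically for a specific case. The following sketch assumes a Pareto distribution with scale 1 and `alpha=2.0` and approximates the convolution integral with a simple midpoint rule; the distribution, the step count and the function names are illustrative assumptions.

```python
def pareto_tail(x, alpha=2.0):
    """Tail P(X > x) of a Pareto distribution with scale 1 (assumed example)."""
    return x ** (-alpha) if x >= 1.0 else 1.0

def pareto_density(y, alpha=2.0):
    """Density of the same Pareto distribution on [1, infinity)."""
    return alpha * y ** (-alpha - 1.0) if y >= 1.0 else 0.0

def two_fold_tail(x, alpha=2.0, steps=100_000):
    """Approximate P(X1 + X2 > x) for x > 2 with a midpoint rule.

    P(X1 + X2 > x) is the integral over y >= 1 of P(X1 > x - y) f(y) dy.
    For y > x - 1 the inner tail equals 1, contributing P(X2 > x - 1).
    """
    lower, upper = 1.0, x - 1.0
    h = (upper - lower) / steps
    integral = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * h
        integral += pareto_tail(x - y, alpha) * pareto_density(y, alpha)
    return integral * h + pareto_tail(x - 1.0, alpha)

# Ratio of the two-fold convolution tail to the single tail at two levels.
ratios = {x: two_fold_tail(x) / pareto_tail(x) for x in (100.0, 1000.0)}
for x, r in ratios.items():
    print(f"x={x:6.0f}  ratio: {r:.4f}")
```

Under these assumptions the ratio sits slightly above 2 and moves closer to 2 at the higher level, in line with the asymptotic relation.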
All commonly used heavy-tailed distributions are subexponential.[3]
Those that are one-tailed include:
Those that are two-tailed include: